ENCODING, DECODING AND
COMPRESSION OF AUDIO-TYPE DATA
USING REFERENCE COEFFICIENTS
LOCATED WITHIN A BAND OF
COEFFICIENTS

RELATED APPLICATION 

This is a continuation of application Ser. No. 07/879,635,
filed on May 7, 1992, now U.S. Pat. No. 5,369,724, which
is a continuation-in-part of Ser. No. 08/822,247, filed Jan.
17, 1992, now U.S. Pat. No. 5,394,508.

The present invention relates generally to the field of
signal processing, and more specifically to data encoding
and compression. The invention relates most specifically to
a method and an apparatus for the encoding and compression
of digital data representing audio signals or signals
generally having the characteristics of audio signals.

BACKGROUND OF THE INVENTION 

Audio signals are ubiquitous. They are transmitted as
radio signals and as part of television signals. Other signals,
such as speech, share pertinent characteristics with audio
signals, such as the importance of spectral domain
representations. For many applications, it is beneficial to store
and transmit audio-type data encoded in a digital form, rather
than in an analogue form. Such encoded data is stored on
various types of digital media, including compact audio
discs, digital audio tape, magnetic disks, and computer memory,
both random access (RAM) and read only (ROM), to
name a few.

It is beneficial to minimize the amount of digital data
required to adequately characterize an audio-type analogue
signal. Minimizing the amount of data results in minimizing
the amount of physical storage media that is required, thus
reducing the cost and increasing the convenience of whatever
hardware is used in conjunction with the data. Minimizing
the amount of data required to characterize a given
temporal portion of an audio signal also permits faster
transmission of a digital representation of the audio signal
over any given communication channel. This also results in
a cost saving, since compressed data representing the same
temporal portion of an audio signal can be sent more
quickly, relative to uncompressed data, or can be sent over
a communications channel having a narrower bandwidth,
both of which consequences are typically less costly.

The principles of digital audio signal processing are well
known and set forth in a number of sources, including
Watkinson, John, The Art of Digital Audio, Focal Press,
London (1988). An analogue audio signal x(t) is shown
schematically in FIG. 1. The horizontal axis represents time.
The amplitude of the signal at a time t is shown on the
vertical axis. The scale of the time axis is in milliseconds, so
approximately two thousandths of a second of audio signal
is represented schematically in FIG. 1. A basic first step in
the storage or transmission of the analogue audio signal as
a digital signal is to sample the signal into discrete signal
elements, which will be further processed.

Sampling the signal x(t) is shown schematically in FIG. 2.
The signal x(t) is evaluated at many discrete moments in
time, for example at a rate of 48 kHz. By sampling, it is
meant that the amplitude of the signal x(t) is noted and
recorded forty-eight thousand times per second. Thus, for a
period of one msec (1×10^-3 sec), the signal x(t) will be
sampled forty-eight times. The result is a temporal series
x(n) of amplitudes, as shown in FIG. 2, with gaps between
the amplitudes for the portions of the analogue audio signal
x(t) which were not measured. If the sampling rate is high
enough relative to the time-wise variations in the analogue
signal, then the magnitudes of the sampled values will
generally follow the shape of the analogue signal. As shown
in FIG. 2, the sampled values follow signal x(t) rather well.
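
By way of illustration only, the sampling step can be sketched as follows. The function names and the 1 kHz test tone are hypothetical and are chosen only to stand in for x(t); the 48 kHz rate and the one msec period are the example values given above.

```python
import math

SAMPLE_RATE_HZ = 48_000  # example sampling rate from the text


def sample_signal(x, duration_sec, rate_hz=SAMPLE_RATE_HZ):
    """Evaluate the continuous-time function x(t) at t = n / rate_hz."""
    count = round(duration_sec * rate_hz)
    return [x(n / rate_hz) for n in range(count)]


def tone(t):
    # Hypothetical test signal: a 1 kHz sine wave standing in for x(t).
    return math.sin(2 * math.pi * 1000.0 * t)


samples = sample_signal(tone, 1e-3)
print(len(samples))  # 48 samples in one millisecond
```

The list `samples` plays the role of the temporal series x(n); the values of x(t) between sampling instants are simply not recorded.
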
The outline of a general method of digital signal processing
is shown schematically in FIG. 4a. The initial step of
obtaining the audio signal is shown at 99 and the step of
sampling is indicated at 102. Once the signal has been
sampled, it is typically transformed from the time domain,
the domain of FIGS. 1 and 2, to another domain that
facilitates analysis. Typically, a signal in time can be written
as a sum of a number of simple harmonic functions of time,
such as cos ωt and sin ωt, for each of the various harmonic
frequencies ω. The expression of a time varying signal as
a series of harmonic functions is treated generally in
Feynman, R., Leighton, R., and Sands, M., The Feynman
Lectures on Physics, Addison-Wesley Publishing Company,
Reading, Mass. (1963) Vol. I, § 50, which is incorporated
herein by reference. Various transformation methods
(sometimes referred to as "subband" methods) exist and are
well known. Baylon, David and Lim, Jae, "Transform/
Subband Analysis and Synthesis of Signals," pp. 540-544,
ISSPA 90, Gold Coast, Australia, Aug. 27-31 (1990). One
such method is the Time-Domain Aliasing Cancellation
method ("TDAC"). Another such transformation is known
as the Discrete Cosine Transform ("DCT"). The transformation
is achieved by applying a transformation function to
the original signal. An example of a DCT transformation is:

$$
X(k) =
\begin{cases}
\displaystyle\sum_{n=0}^{N-1} x(n)\,\cos\frac{\pi k\,(2n+1)}{2N}, & 0 \le k \le N-1,\\[4pt]
0, & \text{otherwise,}
\end{cases}
$$

where k is the frequency variable and N is typically the
number of samples in the window.
The transformation produces a set of amplitude coefficients
of a variable other than time, typically frequency. The
coefficients can be either real valued or complex valued.
(If X(k) is complex valued, then the present invention
can be applied to the real and imaginary parts of X(k)
separately, or the magnitude and phase parts of X(k)
separately, for example. For purposes of discussion, it will
be assumed, however, that X(k) is real valued.) A typical plot
of a portion of the signal x(n) transformed to X(k) is shown
schematically in FIG. 3. If the inverse of the transform
operation is applied to the transformed signal X(k), then the
original sampled signal x(n) will be produced.
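
The relationship between a transform and its inverse can be illustrated with a minimal sketch. It assumes the common unnormalized DCT-II form, X(k) = sum over n of x(n) cos(πk(2n+1)/(2N)), which is only one of several DCT conventions; the function names and the frame values are made up for the illustration.

```python
import math


def dct(x):
    """Unnormalized DCT-II: X(k) = sum_n x(n) cos(pi k (2n + 1) / (2N))."""
    n_samples = len(x)
    return [
        sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_samples))
            for n in range(n_samples))
        for k in range(n_samples)
    ]


def inverse_dct(coeffs):
    """Inverse of dct(): x(n) = (1/N) [X(0) + 2 sum_{k>=1} X(k) cos(...)]."""
    n_samples = len(coeffs)
    return [
        (coeffs[0] + 2 * sum(
            coeffs[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_samples))
            for k in range(1, n_samples))) / n_samples
        for n in range(n_samples)
    ]


frame = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.1, -0.6]  # made-up frame of x(n)
spectrum = dct(frame)
restored = inverse_dct(spectrum)
# Only floating-point rounding error separates the restored frame from x(n).
print(max(abs(a - b) for a, b in zip(frame, restored)))
```

The direct double loop is chosen for readability; a practical coder would use a fast algorithm.
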
The transform is taken by applying the transformation
function to a time-wise slice of the sampled analogue signal
x(n). The slice (known as a "frame") is selected by applying
a window at 104 to x(n). Various windowing methods are
appropriate. The windows may be applied sequentially, or,
more typically, there is an overlap. The window must be
consistent with the transform method, in a typical case, the
TDAC method. As shown in FIG. 2, a window w1(n) is
applied to x(n), and encompasses forty-eight samples,
covering a duration of one msec (1×10^-3 sec). (Forty-eight
samples have been shown for illustration purposes only. In
a typical application, many more samples than forty-eight
are included in a window.) The window w2(n) is applied to
the following msec. The windows are typically overlapped,
but non-overlapping windows are shown for illustration
purposes only. Transformation of signals from one domain
to another, for example from time to frequency, is discussed
in many basic texts, including: Oppenheim, A. V., and
Schafer, R. W., Digital Signal Processing, Englewood Cliffs,
N.J., Prentice Hall (1975); Rabiner, L. R., Gold, B., Theory
and Application of Digital Signal Processing, Englewood
Cliffs, N.J., Prentice Hall (1975), both of which are
incorporated herein by reference.
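
The selection of frames can be sketched as follows, by way of example only. The sketch assumes a rectangular window (each selected sample is taken unchanged), which is a simplification; a window suited to a particular transform such as TDAC would also weight the samples. The function name and the `hop` parameter are hypothetical.

```python
def frames(samples, window_len, hop):
    """Return successive slices of window_len samples, advancing by hop.

    hop == window_len gives the non-overlapping windows shown for
    illustration; hop < window_len gives overlapping windows.
    """
    return [
        samples[start:start + window_len]
        for start in range(0, len(samples) - window_len + 1, hop)
    ]


x = list(range(96))  # stand-in for two msec of samples at 48 kHz
non_overlapping = frames(x, window_len=48, hop=48)
overlapping = frames(x, window_len=48, hop=24)
print(len(non_overlapping), len(overlapping))  # 2 3
```
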

Application of the transformation, indicated at 106 of
FIG. 4a, to the window of the sampled signal x(n) results in
a set of coefficients for a range of discrete frequencies. Each
coefficient of the transformed signal frame represents the
amplitude of a component of the transformed signal at the
indicated frequency. The number of frequency components
is typically the same for each frame. Of course, the amplitudes
of components of corresponding frequencies will
differ from segment to segment.

As shown in FIG. 3, the signal X(k) is a plurality of
amplitudes at discrete frequencies. This signal is referred to
herein as a "spectrum" of the original signal. According to
known methods, the next step is to encode the amplitudes for
each of the frequencies according to some binary code, and
to transmit or store the coded amplitudes.

An important task in coding signals is to allocate the fixed
number of available bits to the specification of the amplitudes
of the coefficients. The number of bits assigned to a
coefficient, or any other signal element, is referred to herein
as the "allocated number of bits" of that coefficient or signal
element. This step is shown in relation to the other steps at
107 of FIG. 4a. Generally, for each frame, a fixed number of
bits, N, is available. N is determined from considerations
such as: the bandwidth of the communication channel over
which the data will be transmitted; or the capacity of storage
media; or the amount of error correction needed. As mentioned
above, each frame generates the same number, C, of
coefficients (even though the amplitude of some of the
coefficients may be zero).

Thus, a simple method of allocating the N available bits
is to distribute them evenly among the C coefficients, so that
each coefficient can be specified by N/C bits. (For discussion
purposes, it is assumed that N/C is an integer.) Thus,
considering the transformed signal X(k) as shown in FIG. 3,
the coefficient 32, having an amplitude of approximately one
hundred, would be represented by a code word having the
same number of bits (N/C) as would the coefficient 34,
which has a much smaller amplitude, of only about ten.
According to most methods of encoding, more bits are
required to specify or encode a number within a larger range
than are required to specify a number within a smaller range,
assuming that both are specified to the same precision. For
instance, to encode integers between zero and one hundred
with perfect accuracy using a simple binary code, seven bits
are required, while four bits are required to specify integers
between zero and ten. Thus, if seven bits were allocated to
each of the coefficients in the signal, then three bits would
be wasted for every coefficient that could have been specified
using only four bits. Where only a limited number of
bits are available to allocate among many coefficients, it is
important to conserve, rather than to waste bits. The waste
of bits can be reduced if the range of the values is known
accurately.
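
The bit counts in the foregoing example can be checked with a short sketch. The function name is hypothetical, and a plain unsigned binary code is assumed.

```python
def bits_needed(max_value):
    """Bits needed by a plain binary code for every integer in 0..max_value."""
    return max(1, max_value.bit_length())


print(bits_needed(100))  # 7
print(bits_needed(10))   # 4
print(bits_needed(100) - bits_needed(10))  # 3 bits wasted per small coefficient
```
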

There are various known methods for allocating the
number of bits to each coefficient. However, all such known
methods result in either a significant waste of bits, or a
significant sacrifice in the precision of quantizing the
coefficient values. One such method is described in a paper
entitled "High-Quality Audio Transform Coding at 128
Kbits/s," Davidson, G., Fielder, L. and Antill, M., of Dolby
Laboratories, Inc., ICASSP, pp. 1117-1120, Apr. 3-6,
Albuquerque, N. Mex. (1990) (referred to herein as the
"Dolby paper") which is incorporated herein by reference.
According to this method, the transform coefficients are
grouped to form bands, with the widths of the bands
determined by critical band analysis. Transform coefficients
within one band are converted to a band block floating-point
representation (exponent and mantissa). The exponents
provide an estimate of the log-spectral envelope of the audio
frame under examination, and are transmitted as side
information to the decoder.
The log-spectral envelope is used by a dynamic bit
allocation routine, which derives step-size information for
an adaptive coefficient quantizer. Each frame is allocated the
same number of bits, N. The dynamic bit allocation routine
uses only the exponent of the peak spectral amplitude in
each band to increase quantizer resolution for
psychoacoustically relevant bands. Each band's mantissa is quantized to
a bit resolution defined by the sum of a coarse, fixed-bit
component and a fine, dynamically-allocated component.
The fixed bit component is typically established without
regard to the particular frame, but rather with regard to the
type of signal and the portion of the frame in question. For
instance, lower frequency bands may generally receive more
bits as a result of the fixed bit component. The dynamically
allocated component is based on the peak exponent for the
band. The log-spectral estimate data is multiplexed with the
fixed and adaptive mantissa bits for transmission to the
decoder.

Thus the method makes a gross analysis of the maximum
amplitude of a coefficient within a band of the signal, and
uses this gross estimation to allocate the number of bits to
that band. The gross estimate tells only the integral part of
the power of 2 of the coefficient. For instance, if the
coefficient is seven, the gross estimate determines that the
maximum coefficient in the band is between 2^2 and 2^3 (four
and eight), or, if it is twenty-five, that it is between 2^4 and
2^5 (sixteen and thirty-two). The gross estimate (which is an
inaccurate estimate) causes two problems: the bit allocation
is not accurate; and the bits that are allocated are not used
efficiently, since the range of values for any given coefficient
is not known accurately. In the above procedure, each
coefficient in a band is specified to the same level of
accuracy as other coefficients in the band. Further, information
regarding the maximum amplitude coefficients in the
bands is encoded in two stages: first the exponents are
encoded and transmitted as side information; second, the
mantissa is transmitted along with the mantissas for the
other coefficients.
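
The coarseness of an exponent-only estimate can be illustrated as follows. This is a minimal sketch of the arithmetic described above, not of the Dolby paper's actual routine; the function name is hypothetical and a positive peak amplitude is assumed.

```python
import math


def exponent_estimate(peak):
    """Return (e, low, high) with low = 2**e <= peak < 2**(e + 1) = high."""
    e = math.floor(math.log2(peak))  # integral part of the power of 2
    return e, 2 ** e, 2 ** (e + 1)


print(exponent_estimate(7))   # (2, 4, 8)
print(exponent_estimate(25))  # (4, 16, 32)
```

In each case the exponent fixes the peak only to within a factor of two, which is the inaccuracy referred to above.
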

In addition to determining how many bits to allocate to
each coefficient for encoding that coefficient's amplitude, an
encoding method must also divide the entire amplitude
range into a number of amplitude divisions, shown at 108 in
FIG. 4a, and allocate a code to each division, at 109. The
number of bits in the code is equal to the number of bits
allocated for each coefficient. The divisions are typically
referred to as "quantization levels," because the actual
amplitudes are quantized into the available levels, or
"reconstruction levels," after coding, transmission or storage and
decoding. For instance, if three bits are available for each
coefficient, then 2^3, or eight, reconstruction levels can be
identified.

FIG. 5 shows a simple scheme for allocating a three bit
code word for each of the eight regions of amplitude
between 0 and 100. The code word 000 is assigned to all
coefficients whose transformed amplitude, as shown in FIG.
3, is between 0 and 12.5. Thus, all coefficients between 0 and
12.5 are quantized at the same value, typically the middle
value of 6.25. The codeword 001 is assigned to all coefficients
between 12.5 and 25.0, all of which are quantized to
the value of 18.75. Similarly, the codeword 100 is assigned
to all coefficients between 50.0 and 62.5, all of which are
quantized to the value of 56.25. Rather than assigning
uniform length codewords to the coefficients, with uniform
quantization levels, it is also known to assign variable length
codewords to encode each coefficient, and to apply
non-uniform quantization levels to the coded coefficients.
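
A minimal sketch of the uniform scheme of FIG. 5 follows. The function name is hypothetical, and the handling of amplitudes that fall exactly on a region boundary (they are placed in the upper region) is an assumption made for the illustration.

```python
def quantize(amplitude, bits=3, full_scale=100.0):
    """Return (codeword, reconstruction level) for a uniform quantizer."""
    levels = 2 ** bits
    step = full_scale / levels
    # Clamp so that an amplitude equal to full_scale falls in the top region.
    index = min(int(amplitude // step), levels - 1)
    codeword = format(index, f"0{bits}b")
    # The reconstruction level is the middle value of the region.
    return codeword, (index + 0.5) * step


print(quantize(10.0))  # ('000', 6.25)
print(quantize(20.0))  # ('001', 18.75)
print(quantize(55.0))  # ('100', 56.25)
```
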

It is also useful to determine a masking level. The
masking level relates to human perception of acoustic signals.
For a given acoustic signal, it is possible to calculate
approximately the level of signal distortion (for example,
quantization noise) that will not be heard or perceived,
because of the signal. This is useful in various applications.
For example, some signal distortion can be tolerated without
the human listener noticing it. The masking level can thus be
used in allocating the available bits to different coefficients.

The entire basic process of digitizing an audio signal, and
synthesizing an audio signal from the encoded digital data is
shown schematically in FIG. 4a and the basic apparatus is
shown schematically in FIG. 4b. An audio signal, such as
music, speech, traffic noise, etc., is obtained at 99 by a
known device, such as a microphone. The audio signal x(t)
is sampled 102, as described above and as shown in FIG. 2.
The sampled signal x(n) is windowed 104 and transformed
106. After transformation (which may be a subband
representation), the bits are allocated 107 among the
coefficients, and the amplitudes of the coefficients are quantized
108, by assigning each to a reconstruction level and
these quantized points are coded 109 by binary codewords.
At this point, the data is transmitted 112 either along a
communication channel or to a storage device.

The preceding steps, 102, 104, 106, 107, 108, 109, and
112 take place in hardware that is generally referred to as the
"transmitter," as shown at 150 in FIG. 4b. The transmitter
typically includes a signal coder (also referred to as an
encoder) 156 and may include other elements that further
prepare the encoded signal for transmission over a channel
160. However, all of the steps mentioned above generally
take place in the coder, which may itself include multiple
components.

Eventually, the data is received by a receiver 164 at the
other end of the data channel 160, or is retrieved from the
memory device. As is well known, the receiver includes a
decoder 166 that is able to reverse the coding process of the
signal coder 156 with reasonable precision. The receiver
typically also includes other elements, not shown, to reverse
the effect of the additional elements of the transmitter that
prepare the encoded signal for transmission over channel
160. The signal decoder 166 is equipped with a codeword
table, which correlates the codewords to the reconstruction
levels. The data is decoded 114 from binary into the quantized
reconstruction amplitude values. An inverse transform
is applied 116 to each set of quantized amplitude values,
resulting in a signal that is similar to a frame of x(n), i.e. it
is in the time domain, and it is made up of a discrete number
of values, for each inverse transformed result. However, the
signal will not be exactly the same as the corresponding
frame of x(n), because of the quantization into reconstruction
levels and the specific representation used. The difference
between the original value and the value of the reconstruction
level cannot typically be recovered. A stream of
inverse transformed frames is combined 118, and an audio
signal is reproduced 120, using known apparatus, such as a
D/A converter and an audio speaker.

OBJECTS OF THE INVENTION

Thus, the several objects of the invention include, to
provide a method and apparatus for coding and decoding
digital audio-type signals: which permits efficient allocation
of bits such that in general, fewer bits are used to specify
coefficients of smaller magnitude than are used to specify
larger coefficients; which provides for a quantization of the
amplitude of the coefficients such that bands including larger
coefficients are divided into reconstruction levels differently
from bands including only smaller coefficients, such that
both smaller and larger coefficients can be specified more
accurately than if the same reconstruction levels were used
for all coefficients; which permits accurate estimation of the
masking level; which permits efficient allocation of bits
based on the masking level; which robustly localizes errors
to small portions of the digitized data, and, with respect to
that data, limits the error to a small, known range; and which
minimizes the need to redundantly encode coefficients, all
allowing a highly efficient use of available bits.

BRIEF DESCRIPTION OF THE INVENTION 
In a first preferred embodiment, the invention is a method
for encoding a selected aspect of a signal that is defined by
signal elements that are discrete in at least one dimension,
said method comprising the steps of: dividing the signal into
at least one band, at least one of said at least one bands
having a plurality of adjacent signal elements; in at least one
band, identifying a signal element having a magnitude with
a preselected size relative to other signal elements in said
band and designating said signal element as a "yardstick"
signal element for said band; and encoding the location of at
least one yardstick signal element with respect to its position
in said respective band.
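
By way of example only, the identification of a yardstick and the encoding of its location can be sketched as follows. The sketch assumes that the "preselected size" is the greatest magnitude in the band, and that the location is coded as a fixed-length binary index within the band; both are assumptions made for the illustration, and the function name and band values are hypothetical.

```python
import math


def encode_yardstick_locations(bands):
    """For each band, return (position, bit string) of its largest element."""
    encoded = []
    for band in bands:
        position = max(range(len(band)), key=lambda i: abs(band[i]))
        # Enough bits to index any position in this band.
        width = max(1, math.ceil(math.log2(len(band))))
        encoded.append((position, format(position, f"0{width}b")))
    return encoded


bands = [[3.0, -9.0, 1.0, 4.0], [0.5, 0.2, -0.1, 0.4, 2.5, -1.0, 0.3, 0.0]]
print(encode_yardstick_locations(bands))  # [(1, '01'), (4, '100')]
```

Because the position is expressed relative to the band, a band of four elements needs only two bits and a band of eight elements needs only three.
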

In a second preferred embodiment, the invention is a
method for decoding a code representing a selected aspect of
a signal that is defined by signal elements that are discrete
in at least one dimension, which has been encoded by a
method comprising the steps of: dividing the signal into at
least one band, at least one of said at least one bands having
a plurality of adjacent signal elements; in at least one band,
identifying a signal element having a magnitude with a
preselected size relative to other signal elements in said band
and designating said signal element as a "yardstick" signal
element for said band; encoding the location of at least one
yardstick signal element with respect to its position in said
respective band; and using a function of said encoded
location of said at least one yardstick signal element to
encode said selected aspect of said signal; said method of
decoding comprising the step of translating said encoded
aspect of said signal based on a function of the location of
said yardstick signal element that is appropriately inversely
related to said function of the location used to encode said
selected aspect of said signal.

In a third preferred embodiment, the invention is an
apparatus for encoding a selected aspect of a signal that is
defined by signal elements that are discrete in at least one
dimension, said apparatus comprising: means for dividing
the signal into at least one band, at least one of said at least
one bands having a plurality of adjacent signal elements; in
at least one band, means for identifying a signal element
having a magnitude with a preselected size relative to other
signal elements in said band and means for designating said
signal element as a "yardstick" signal element for said band;
means for encoding the location of at least one yardstick
signal element with respect to its position in said respective
band; and means for quantizing the magnitude of said at
least one yardstick signal element for which the location was
encoded.

In a fourth preferred embodiment, the invention is an
apparatus for decoding a code representing a selected aspect
of a signal that is defined by signal elements that are discrete
in at least one dimension, which has been encoded by a
method comprising the steps of: dividing the signal into at
least one band, at least one of said at least one bands having
a plurality of adjacent signal elements; in at least one band,
identifying a signal element having a magnitude with a
preselected size relative to other signal elements in said band
and designating said signal element as a "yardstick" signal
element for said band; encoding the location of at least one
yardstick signal element with respect to its position in said
respective band; and using a function of said encoded
location of said at least one yardstick signal element to
encode said selected aspect of said signal; said decoding
apparatus comprising means for translating said encoded
aspect of said signal based on a function of the location of
said yardstick signal element that is appropriately inversely
related to said function of the location used to encode said
selected aspect of said signal.

In a fifth preferred embodiment, the invention is a method
for encoding a selected signal element of a signal that is
defined by signal elements that are discrete in at least one
dimension, said method comprising the steps of: dividing
the signal into a plurality of bands, at least one band having
a plurality of adjacent signal elements; in each band,
identifying a signal element having the greatest magnitude of
any signal element in said band, and designating said signal
element as a "yardstick" signal element for said band;
quantizing the magnitude of each yardstick signal element to
a first degree of accuracy; and allocating to said selected
signal element a signal element bit allocation that is a
function of the quantized magnitudes of said yardstick signal
elements, said signal element bit allocation chosen such that
quantization of said selected signal element using said signal
element bit allocation is to a second degree of accuracy,
which is less than said first degree of accuracy.
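
One way in which a bit allocation could depend on quantized yardstick magnitudes is sketched below. The particular rule (enough bits to cover the range from zero to the quantized yardstick in steps no larger than `target_step`), the rounding of the yardstick up to the next level, and all names and numeric values are hypothetical and are not taken from the text; the sketch treats magnitudes only.

```python
import math


def quantize_yardstick(magnitude, bits=6, full_scale=128.0):
    """Round the yardstick magnitude up to the next of 2**bits uniform levels."""
    step = full_scale / 2 ** bits
    return min(full_scale, math.ceil(magnitude / step) * step)


def allocate_bits(bands, target_step=8.0):
    """Hypothetical rule: bits per band grow with the quantized yardstick."""
    allocation = []
    for band in bands:
        yardstick = quantize_yardstick(max(abs(c) for c in band))
        if yardstick > 0:
            bits = max(0, math.ceil(math.log2(yardstick / target_step)))
        else:
            bits = 0
        allocation.append((yardstick, bits))
    return allocation


bands = [[100.0, 40.0, -63.0], [9.0, -3.0, 10.0], [1.0, 0.5, -0.2]]
print(allocate_bits(bands))  # [(100.0, 4), (10.0, 1), (2.0, 0)]
```

In this sketch the yardsticks are resolved in steps of 2.0 (the first degree of accuracy), while the remaining elements are resolved no more finely than `target_step` would require (a second, lesser degree of accuracy), and bands holding only small elements receive fewer bits.
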

In a sixth preferred embodiment, the invention is a method
for encoding a selected signal element of a signal that is
defined by signal elements that are discrete in at least one
dimension, said method comprising the steps of: dividing
the signal into a plurality of bands, at least one band having
a plurality of adjacent signal elements, one of said bands
including said selected signal element; in each band,
identifying a signal element having the greatest magnitude of
any signal element in said band, and designating said signal
element as a "yardstick" signal element for said band;
quantizing the magnitude of each yardstick signal element
only one time; and allocating to said selected signal element a
signal element bit allocation that is a function of the
quantized magnitudes of said yardstick signal elements.

In a seventh preferred embodiment, the invention is a
method of decoding a selected signal element that has been
encoded by either of the preferred methods of the invention
mentioned above, said method of decoding comprising the
step of translating a codeword generated by the method of
encoding based on a function of the quantized magnitudes of
said yardstick signal elements that is appropriately inversely
related to said function of the quantized magnitudes used to
allocate bits to said selected signal element.

In an eighth preferred embodiment, the invention is an
apparatus for encoding a selected signal element of a signal
that is defined by signal elements that are discrete in at least
one dimension, said apparatus comprising: means for dividing
the signal into a plurality of bands, at least one band
having a plurality of adjacent signal elements, one of said
bands including said selected signal element; means for
identifying, in each band, a signal element having the
greatest magnitude of any signal element in said band, and
designating said signal element as a "yardstick" signal
element for said band; means for quantizing the magnitude
of each yardstick signal element to a first degree of accuracy;
and means for allocating to said selected signal element a signal
element bit allocation that is a function of the quantized
magnitudes of said yardstick signal elements, said signal
element bit allocation chosen such that quantization of said
selected signal element using said signal element bit
allocation is to a second degree of accuracy, which is less than
said first degree of accuracy.

In a ninth preferred embodiment, the invention is an
apparatus for decoding a codeword representing a selected
signal element of a signal that has been encoded by a method
of the invention mentioned above, the apparatus comprising
means for translating said codeword based on a function of
the quantized magnitudes of said yardstick signal elements
that is appropriately inversely related to said function of the
quantized magnitudes used to allocate bits to said selected
signal element.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows schematically an audio-type signal.

FIG. 2 shows schematically an audio-type signal that has
been sampled.

FIG. 3 shows schematically the spectrum of an audio-type
signal transformed from the time domain to the frequency
domain.

FIG. 4a shows schematically the digital processing of an
audio-type signal according to known methods.

FIG. 4b shows schematically the hardware elements of a
known digital signal processing system.

FIG. 5 shows schematically the division of the amplitude
of coefficients into reconstruction levels, and the assignment
of codewords thereto, according to methods known in the
prior art.

FIG. 6 shows schematically the division of a spectrum of
an audio-type signal into frequency bands according to the
prior art.

FIG. 7 shows schematically the spectrum of FIG. 6, after
application of a scaling operation, further designating
yardstick coefficients within bands.

FIG. 7a shows schematically how the yardstick coefficients
are used to establish a rough estimate of |X(k)|^α.

FIG. 8 shows schematically the division of the amplitude
of coefficients in different bands into different reconstruction
levels, according to the method of the invention.

FIG. 9a shows schematically one choice for assignment
of reconstruction levels to a coefficient that may have only
a positive value.

FIG. 9b shows schematically another choice for assignment
of reconstruction levels to a coefficient that may have
only a positive value.

FIG. 10a shows schematically one choice for assignment
of reconstruction levels to a coefficient that may have either
a positive or a negative value.

FIG. 10b shows schematically another choice for assignment
of reconstruction levels to a coefficient that may have
either a positive or a negative value.

FIG. 11 shows schematically how the magnitudes of
yardstick coefficients can be used to allocate the number of
bits for a band.

FIG. 12, in parts 12a, 12b and 12c, shows schematically
the steps of the method of the invention.

FIG. 13, in parts 13a and 13b, shows schematically the
components of the apparatus of the invention.
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A first preferred embodiment of the invention is a method 
of allocating bits to individual coefficients, for the encoding 
of the magnitude (Le, the absolute value of the amplitude) of 
these coefficients. According to the method of the invention, 
an audio signal x(t) is obtained as in FIG. 4a at 99, and 
sampled at a suitable rate, such as 48 kHz as at 102, resulting 
in x(n). The sampled signal is windowed and transformed, as 
at 104 and 106, according to a known, suitable technique, 
such as TDAC or DCT, using an appropriate window of a 
typical size, eg. 512 or 1024 samples. It will be understood 
that other transformation and windowing techniques are 
within the scope of the present invention. If no transforma- 
tion is performed, the invention is applied to sampled signal 
elements rather than coefficient signal elements. In fact, the 
invention is beneficially applied to non-transformed, 
sampled audio-type signals. Transformation is not 
necessary, but merely exploits certain structural character- 
istics of the signal. Thus, if the transformation step is 
skipped, it is more difficult to exploit the ordering. The result 
is a spectrum of coefficient signal elements in the frequency 
domain, such as is shown in FIG. 3. As used herein, the 
phrase "signal elements" shall mean portions of a signal, in
general. They may be sampled portions of an untransformed
signal or coefficients of a transformed signal or an entire
signal itself. The steps of the method are shown schematically
in flow chart form in FIGS. 12a, 12b and 12c.

An important aspect of the method of the invention is the 
method by which the total number of bits, N, is allocated
among the total number of coefficients, C. According to the
method of the invention, the number of bits allocated is 
correlated closely to the amplitude of the coefficient to be 
encoded. 

The first step of the method is to divide the spectrum of
transform coefficients in X(k) into a number B of bands,
such as B equal to sixteen or twenty-six. This step is indicated
at 600 in FIG. 12a. It is not necessary for each band to
include the same number of coefficients. In fact, it may be
desirable to include more frequency coefficients in some
bands, such as higher frequency bands, than in other, lower
frequency bands. In such a case, it is beneficial to
approximately follow the critical band result. An example of the
spectrum X(k) (for X(k) having real values) is shown
schematically in FIG. 6, divided into bands. Other typical
spectra may show a more marked difference in the number
of coefficients per band, typically with relatively more
coefficients in the higher rather than the lower bands.

If the number of frequency coefficients in each band is not
uniform, then the pattern of the bandwidth of each band
must be known or communicated to the decoding elements
of the apparatus of the invention. The non-uniform pattern
can be set, and stored in memory accessible by the decoder.
If, however, the bandwidth of the bands is varied "on-the-fly"
based on local characteristics, then the decoder must be
made aware of these variations, typically by an explicit
message indicating the pattern.

As shown in FIG. 6, the spectrum is divided into many
bands, b1, b2, . . . bB, indicated by a small, dark square
between bands. It is useful, as explained below, if each band
is made up of a number of coefficients that equals a power 
of two. At this point, it is also possible to ignore frequencies 
that are not of interest, for instance because they are too high 
to be discerned by a human listener. 
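
By way of illustration only, the band division described above can be sketched as follows. The function names and the example band sizes are assumptions introduced for this sketch and do not appear in the specification; the sizes are chosen so that each band holds a power-of-two number of coefficients, with wider bands at higher frequencies.

```python
def split_into_bands(spectrum, band_sizes):
    """Split a list of coefficients into consecutive bands.

    band_sizes is a hypothetical, pre-agreed pattern that the decoder
    is assumed to know as well (or to receive as side information).
    """
    if sum(band_sizes) > len(spectrum):
        raise ValueError("band pattern is longer than the spectrum")
    bands, start = [], 0
    for size in band_sizes:
        bands.append(spectrum[start:start + size])
        start += size
    # Coefficients beyond the last band (for instance, frequencies too
    # high to be of interest) are simply left out.
    return bands


def is_power_of_two(n):
    """True when n is 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


# Illustrative pattern: narrow low bands, wider high bands.
example_sizes = [2, 2, 4, 8]
example_bands = split_into_bands(list(range(20)), example_sizes)
```

In this sketch the last four of the twenty coefficients fall outside the pattern and are ignored, which corresponds to discarding frequencies that are not of interest.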

It may be useful, although not necessary for the invention,
to analyze the spectrum coefficients in a domain where the
spectrum magnitudes are compressed through non-linear
mapping, such as raising each magnitude to a fractional
power α, such as 1/2, or a logarithmic transformation. The
human auditory system appears to perform some form of
amplitude compression. Also, non-linear mapping such as
amplitude compression tends to lead to a more uniform
distribution of the amplitudes, so that a uniform quantizer is
more efficient. Non-linear mapping followed by uniform
quantization is an example of the well known non-uniform
quantization.
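
The equivalence between non-linear mapping followed by uniform quantization and non-uniform quantization can be seen in a minimal sketch such as the following. The function names, the step size, and the choice of a fractional power of one-half are assumptions made for illustration only.

```python
def compress(x, alpha=0.5):
    """Map a coefficient to the compressed domain, keeping its sign."""
    sign = -1.0 if x < 0 else 1.0
    return sign * abs(x) ** alpha


def expand(y, alpha=0.5):
    """Undo compress() by raising the magnitude to the power 1/alpha."""
    sign = -1.0 if y < 0 else 1.0
    return sign * abs(y) ** (1.0 / alpha)


def uniform_quantize(y, step):
    """Round y to the nearest multiple of step."""
    return float(step * round(y / step))


def nonuniform_quantize(x, step, alpha=0.5):
    """Uniform quantization in the compressed domain, seen from outside."""
    return expand(uniform_quantize(compress(x, alpha), step), alpha)


# Reconstruction values reachable for inputs between 0.0 and 19.9.
levels = sorted({nonuniform_quantize(n / 10.0, 1.0) for n in range(200)})
```

With a step of one in the compressed domain, the reachable values in the original domain are the squares 0, 1, 4, 9 and 16, so the levels are closely spaced for small magnitudes and widely spaced for large ones.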

This step of non-linear mapping is indicated at 602 in
FIG. 12a. The transformed spectrum is shown in FIG. 7,
which differs from FIG. 6 in the vertical scale.

In each band of the exponentially scaled spectrum, the
coefficient Cb1, Cb2, . . . CbB having the largest magnitude
(ignoring sign) is designated as a "yardstick coefficient."
This step is indicated at 608 in FIG. 12a. The yardstick
coefficients are indicated in FIG. 7 by a small rectangle
enclosing the head of the coefficient marker. (In another
preferred embodiment discussed below, rather than designating
the coefficient that has the maximum magnitude in the
band as the yardstick coefficient, another coefficient can be
designated as the yardstick. Such other coefficient can be the
one having a median or middle amplitude in the band, or a
high, but not the largest, magnitude in the band, such as the
second or third highest. The embodiment designating the
maximum magnitude coefficient as the yardstick is the
predominant example discussed below, and is discussed
first.)
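
A minimal sketch of the yardstick designation for the maximum-magnitude embodiment follows. The function name and the sample band are hypothetical, and the tie-breaking rule (the first occurrence wins) is an assumption, since the specification does not address ties.

```python
def find_yardstick(band):
    """Return (position, magnitude) of the largest-magnitude coefficient.

    The sign is ignored when comparing; on a tie the coefficient
    closest to the low frequency end of the band is chosen.
    """
    position = max(range(len(band)), key=lambda i: abs(band[i]))
    return position, abs(band[position])


# Hypothetical band of four coefficients.
example_position, example_magnitude = find_yardstick([3.0, -9.0, 5.0, 1.0])
```

Here the second coefficient is designated as the yardstick, with a magnitude of nine, even though its amplitude is negative.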

The method of the invention entails several embodiments.
According to each, the magnitude of the yardstick coefficients
is used to allocate bits efficiently among the
coefficients, and also to establish the number and placement
of reconstruction levels. These various embodiments are
discussed in detail below, and are indicated in FIGS. 12a and
12b. More specific embodiments include: to further divide
the spectrum X(k) into split-bands at 612; to accurately
quantize the location and the sign of the yardstick coefficients
at 614; and to perform various transformations on
these quantized coefficients at 616, 618 and 620 before
transmitting data to the decoder. However, the basic method
of the invention in its broadest implementation does not
employ split-bands, thus passing from split-band decision
610 to quantization decision step 614. In the basic method,
only the magnitude of the yardstick coefficients is used, and
thus the method passes from quantization decision step 614
to magnitude transformation decision step 622. The magnitudes
need not be transformed at this stage, and thus, the
basic method passes directly to step 624, where the magnitudes
of the yardstick coefficients are quantized accurately
into reconstruction levels.

The magnitude of each yardstick coefficient is quantized
very accurately, in typical cases more accurately than
is the magnitude of non-yardstick coefficients. In some
cases, this accurate rendering is manifest as using more bits
to encode a yardstick coefficient (on average) than to encode
a non-yardstick coefficient (on average). However, as is
explained below with respect to a yardstick-only transformation
step performed at step 622, this may not be the case.
In general, the higher accuracy of the yardsticks (on
average) is characterized by a smaller divergence between
the original coefficient value and the quantized value, as
compared to the divergence between the same two values for
a non-yardstick coefficient (on average).

After quantization, the yardstick coefficients are encoded
into codewords at 626 (FIG. 12b) and transmitted at 628 to
the receiver. The coding scheme may be simple, such as
applying the digital representation of the position of the
reconstruction level in an ordered set of reconstruction
levels, from lowest amplitude to highest. Alternatively, a
more complicated coding scheme, such as using a codebook,
may be used. As in the case with the receiver of the prior art,
the apparatus of the invention includes a receiver having a
decoder equipped to reverse the coding processes implemented
by the coding apparatus. If a simple coding technique
is used, the receiver may simply reverse the technique.
Alternatively, a codebook may be provided, which correlates
the codewords assigned to the yardstick coefficients with the
reconstruction levels. Because the yardstick coefficients are
quantized very accurately, when the codewords are translated
and the coefficients are reconstructed, they are very
close to the original values. (The next step 632 shown in
FIG. 12b is only implemented if one of the transformation
steps 616, 618 or 620 of FIG. 12a was conducted. The
embodiments where these steps are conducted are discussed
below.)

The accurately quantized magnitudes of the yardstick
coefficients are used to allocate bits among the remaining
coefficients in the band. Because, in this first discussed
embodiment, each yardstick coefficient is the coefficient of
greatest magnitude in the band of which it is a member, it is
known that all of the other coefficients in the band have a
magnitude less than or equal to that of the yardstick
coefficient. Further, the magnitude of the yardstick coefficient is
also known very precisely. Thus it is known how many
coefficients must be coded in the band having the largest
amplitude range, the next largest, the smallest, etc. Bits can
be allocated efficiently among the bands based on this
knowledge.

There are many ways that the bits can be allocated. Two
significant general methods are: to allocate bits to each band,
and then to each coefficient within the band; or to allocate
bits directly to each coefficient without previously allocating
bits to each band. According to one embodiment of the first
general method, initially, the number of bits allocated for
each individual band is determined at 634. More coefficients
in a band will generally result in more bits being
required to encode all of the coefficients of that band.
Similarly, a greater average magnitude |X(k)|^α of the
coefficients in the band will result in more bits being required to
encode all of the coefficients of that band. Thus, a rough
measure of the "size" of each band, "size" being defined in
terms of the number of coefficients and the magnitude of the
coefficients, is determined, and then the available bits are
allocated among the bands in accordance with their relative
sizes, larger bands getting more bits, smaller bands getting
fewer bits.

For instance, as shown in FIG. 7a, for a very rough
estimate, it can be assumed that the magnitude of each
coefficient is the same as the yardstick for that band. This is
indicated in FIG. 7a by a heavily cross-hatched box, having
a magnitude equal to the absolute value of the amplitude of
the yardstick coefficient. As can be understood from a
comparison of FIG. 7 with FIG. 7a, in order to acquire a
rough estimate for the size of each band, it is assumed that
all coefficients are positive. Knowing the number of
coefficients in each band, it is then possible to establish an upper
bound for the size of the band. In an informal sense, this
analysis is similar to determining the energy content of the
band, as compared to the entire energy content of the frame.
Once the relative sizes are determined, well known techniques
are applied to allocate the available bits among the
bands according to the estimated sizes. One technique is set
forth at Lim, J. S., Two-Dimensional Signal and Image
Processing, Prentice Hall, Englewood Cliffs, N.J. (1990), p.
598, incorporated herein by reference. Experience may also
show that it is beneficial to allocate bits among the bands by
assuming that the average magnitude |X(k)|^α of each
non-yardstick coefficient is equal to some other fraction of the
magnitude of the yardstick, such as one-half. This is shown
in FIG. 7a by the less heavily cross-hatched boxes spanning
the bands of the signal. It should be noted that the heavy
cross-hatched regions extend all the way down to the
frequency axis, although the lower portion is obscured by
the less heavily cross-hatched regions.
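
The rough size estimate and the division of bits among bands might be sketched as follows. This is not the technique of the cited Lim reference; it is a simple proportional rule with largest-remainder rounding, adopted here only as an assumption to make the idea concrete. The function names, the yardstick magnitudes and the bit budget are likewise hypothetical.

```python
def estimate_band_sizes(yardsticks, counts, fraction=1.0):
    """Rough size of each band: count x (fraction of yardstick magnitude).

    fraction=1.0 treats every coefficient as equal to the yardstick
    (an upper bound); fraction=0.5 assumes one-half of the yardstick.
    """
    return [fraction * y * n for y, n in zip(yardsticks, counts)]


def allocate_bits_to_bands(sizes, total_bits):
    """Share total_bits among bands in proportion to their sizes."""
    total = sum(sizes)
    if total == 0:
        return [0] * len(sizes)
    shares = [total_bits * s / total for s in sizes]
    bits = [int(share) for share in shares]
    # Hand out the bits lost to truncation, largest fraction first.
    leftovers = total_bits - sum(bits)
    order = sorted(range(len(sizes)),
                   key=lambda i: shares[i] - bits[i], reverse=True)
    for i in order[:leftovers]:
        bits[i] += 1
    return bits


# Two hypothetical bands of four coefficients each.
example_sizes = estimate_band_sizes([9.0, 15.0], [4, 4])
example_bits = allocate_bits_to_bands(example_sizes, 32)
```

Because both bands are scaled by the same fraction, choosing one-half instead of one changes the absolute sizes but not the relative shares in this simple rule.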

It is also possible to adjust the estimate for the size of the
band depending on the number of coefficients (also known
as frequency samples) in the band. For instance, the more
coefficients, the less likely it is that the average magnitude
is equal to the magnitude of the yardstick coefficient. In any
case, a rough estimate of the size of the band facilitates an
appropriate allocation of bits to that band.

Within each band, bits are allocated at 636 among the
coefficients. Typically, bits are allocated evenly; however,
any reasonable rule can be applied. It should be noted that
the magnitudes of the yardstick coefficients have already
been quantized, encoded and transmitted and do not need to
be quantized, encoded or transmitted again. According to the
prior art discussed in the Dolby paper, aspects of the
coefficients used to make a gross analysis of the maximum
magnitude of a coefficient within a band are encoded at two
different stages: first with respect to the exponent and second
with respect to the mantissa.

As is mentioned above, rather than first allocating bits
among the bands, and then allocating bits among the
coefficients in each band, it is also possible to use the estimate
of |X(k)|^α to allocate bits to the coefficients directly without
the intermediate step of allocating bits to the bands. Again,
the rough estimate of |X(k)|^α is used to provide a rough
estimate for the magnitude of every coefficient. As illustrated
in FIG. 7a, the rough estimate for the magnitude of
each coefficient may be the magnitude of the yardstick
coefficient, or one-half that magnitude, or some other
reasonable value. (As discussed below, a more complicated,
yet more useful estimation is possible if information regarding
the location of the yardstick coefficients is also accurately
noted and encoded.) From the estimate of the magnitude
of each of the coefficients, an estimate of the total
magnitude or size of the signal can be made, as above, and
the ratio of the size of the coefficient to the total size is used
as the basis for allocating a number of bits to the coefficient.
The general technique is discussed at Lim, J. S., cited above,
at p. 598.

Due to the accurate quantization of the yardstick
coefficients, the present invention results in a more
appropriate allocation of bits to coefficients in each band than does
the method described in the prior art Dolby paper. Consider,
for example, the two bands b4 and b5 (FIG. 8), having
yardstick coefficients 742 and 743, respectively, with
magnitudes of nine and fifteen, respectively. According to the
prior art method, each yardstick coefficient is quantized
grossly, by encoding only the exponent of the yardstick, and
this gross quantization is used to allocate bits to all of the
coefficients in the yardstick's band. Thus, yardstick
coefficient 742, having a value of nine, would be quantized by the
exponent "3", since it falls between 2^3 and 2^4. Since fifteen
is the maximum number that could have this exponent, the
band in which yardstick coefficient 742 falls is allocated bits
as if the maximum value for any coefficient were fifteen.

Further according to the prior art method, yardstick
coefficient 743, having a value of fifteen, would also be quantized
by exponent "3", since it too falls between 2^3 and 2^4.
Thus, the band in which yardstick coefficient 743 falls is also
allocated bits as if the maximum value for any coefficient
were fifteen. Thus, although the two bands have significantly
different yardstick coefficients, each coefficient in the bands
is allocated the same number of bits. For illustration
purposes, it can be assumed that each coefficient in the two
bands is allocated four bits for quantization.

Conversely, according to the method of the invention,
because the yardstick coefficients are quantized very
accurately, yardstick coefficient 743, having a value of
fifteen, is quantized to fifteen, or very close to fifteen if very
few bits are available. Further, yardstick coefficient 742,
having a value of nine, is quantized as nine, or very close to
nine. Thus, the coefficients in band b4 will be allocated a
different number of bits than will the coefficients in band b5.
For purposes of illustration, it can be assumed that the
coefficients in band b5, having a yardstick of magnitude
fifteen, are each allocated five bits, while coefficients in band
b4, having a yardstick of only nine, are each allocated only
three bits.
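
The loss of range information under exponent-only quantization can be checked with a short sketch. The function name is hypothetical, and integer magnitudes are assumed, as in the example of magnitudes nine and fifteen.

```python
import math


def exponent_only_bound(magnitude):
    """Return (exponent, largest integer sharing that exponent).

    A magnitude m with 2**e <= m < 2**(e + 1) is described only by e,
    so a decoder can infer no tighter upper bound than 2**(e + 1) - 1.
    """
    exponent = int(math.floor(math.log2(magnitude)))
    return exponent, 2 ** (exponent + 1) - 1


bound_for_nine = exponent_only_bound(9)
bound_for_fifteen = exponent_only_bound(15)
```

Both yardsticks yield exponent 3 and an implied maximum of fifteen, so the two bands are indistinguishable for allocation purposes, whereas accurately quantized yardsticks preserve the distinction between nine and fifteen.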

Comparison of the bit allocation of the method of the
invention to that of the prior art method shows that the allocation
according to the method of the invention is much more
appropriate. For band b5, more bits are available (five as
compared to four), so the quantization will be more accurate.
For band b4, fewer bits are used (three as compared to four);
however, since the range is in fact smaller than the prior art
method can determine (nine as compared to fifteen), the
allocation of bits is more appropriate. Further, because the
invention also uses the accurate yardstick quantization to
establish reconstruction levels, which the method of the
prior art does not, the relative accuracy achieved is even
greater, as is next explained.

Once each coefficient has been allocated its allotment of
bits at 636, the highly accurate quantization of the yardstick
coefficients can be used to divide up the entire range of the
band appropriately and to assign reconstruction levels at
638. FIG. 8 shows the reconstruction level allocation
schematically. The yardsticks 743 and 742 of bands b5 and b4 are
shown, along with non-yardstick coefficients 748 and 746,
the former falling in band b4 and the latter falling in band b5,
both of which have a magnitude of five. Following through
with the example considered above, allocation of reconstruction
levels according to the present invention and the
prior art method is illustrated. Since according to the prior
art, coefficients in both bands were assigned the same
number of bits, four, for reconstruction levels, each band
will have 2^4 or sixteen reconstruction levels. These
reconstruction levels are shown schematically by identical scales
750 at either side of FIG. 8. (The reconstruction levels
are illustrated with a short scale line shown at the center of
each reconstruction level.)

The reconstruction levels that would be assigned according
to the method of the invention are quite different from
those of the prior art, and, in fact, differ between the two
bands. In the example, band b5 was assigned five bits per
coefficient, so 2^5 or thirty-two reconstruction levels are
available to quantize coefficients in this band, having a
yardstick of fifteen. These reconstruction levels are shown
schematically at scale 780. Band b4 was assigned only three
bits, so 2^3 or eight reconstruction levels are available for
quantization of coefficients in this band, having a yardstick
of nine. These reconstruction levels are shown at scale 782.

Comparison of the accuracy of the two methods shows
that the method of the invention provides greater efficiency
than does the prior art. For the coefficients in band b5, the
thirty-two reconstruction levels provided as a result of the
five bit allocation clearly provide for more accuracy than do
the sixteen levels provided as a result of the four bit
allocation of the prior art. Further, all of the thirty-two
reconstruction levels are useful. For the coefficients in band
b4, the eight reconstruction levels provided as a result of the
present invention do not provide as many reconstruction
levels as the sixteen provided by the prior art; however, all
of the eight reconstruction levels provided are used, while
several of the reconstruction levels of the prior art (those
falling between nine and fifteen) cannot possibly be useful
for this band, since no coefficient exceeds nine. Thus,
although there are technically more reconstruction levels
allocated to this band as a result of the method of the prior
art, many of them cannot be used, and the resulting gain in
accuracy is small. The bits that are consumed in the allocation
of the unused reconstruction levels could be better used
in the same band by reassignment of the reconstruction
levels to lie in the known accurate range, or in another band
(such as band b5, where the maximum range is relatively
large).

The placement of the boundaries between reconstruction
levels and the assignment of reconstruction values to the
reconstruction levels within the range can be varied to meet
specific characteristics of the signal. If uniform reconstruction
levels are assigned, they can be placed as shown in FIG.
9a, at scale 902 spanning a range of ten, with the highest
reconstruction level being assigned the yardstick value, and
each lower level being assigned a lower value, lessened by
an equal amount, depending on the level size. In such a
scheme, no reconstruction level will be set to zero.
Alternatively, as shown at scale 904, the lowest reconstruction
level can be set to zero, with each higher level being
greater by an equal amount. In such a case, no reconstruction
level will be set to the yardstick. Alternatively, and more
typically, as shown at scale 906, neither the yardstick nor the
zero will be quantized exactly, but each will lie one-half of
a reconstruction level away from the closest reconstruction
level.
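
The three uniform placements can be compared in a brief sketch. The function name, the scheme labels "top", "bottom" and "midpoint", and the choice of two bits are assumptions for illustration; only the range of ten is taken from the description of FIG. 9a.

```python
def uniform_levels(yardstick, bits, scheme="midpoint"):
    """Reconstruction values for a positive-only range [0, yardstick].

    "top":      highest value equals the yardstick, none equals zero.
    "bottom":   lowest value equals zero, none equals the yardstick.
    "midpoint": each value sits at the center of its zone, so zero and
                the yardstick each lie half a zone from the nearest value.
    """
    count = 2 ** bits
    step = yardstick / count
    offset = {"top": 1.0, "bottom": 0.0, "midpoint": 0.5}[scheme]
    return [(i + offset) * step for i in range(count)]


top_levels = uniform_levels(10.0, 2, "top")
bottom_levels = uniform_levels(10.0, 2, "bottom")
midpoint_levels = uniform_levels(10.0, 2, "midpoint")
```

Whichever placement is used, the decoder must apply the same rule as the encoder in order to regenerate identical values from the yardstick magnitude and the bit allocation.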

As in the case of uneven allocation of bits to coefficients
in a band, if more than one reconstruction scheme can be
applied by the encoder, then either a signal must be transmitted
to the decoder along with the data pertaining to the
quantized coefficients indicating which reconstruction
scheme to use, or the decoder must be constructed so that in
all situations, it reproduces the required distribution of
reconstruction levels. This information would be transmitted
or generated in a manner analogous to the manner in which
the specific information pertaining to the number of coefficients
per band would be transmitted or generated, as
discussed above.

Rather than divide up the amplitude of the band evenly, it
may be beneficial to divide it at 638 as shown in FIG. 9b,
specifying reconstruction levels that include and reconstruct
exactly both zero and the yardstick coefficient, and skewing
the distribution of the other reconstruction levels more
toward the yardstick coefficient end of the range.
Alternatively, the reconstruction levels could be clustered
more closely at the zero end of the range, if experience
demonstrates that this is statistically more likely. Thus, in
general, the quantization levels can be non-uniform, tailored
to the characteristics of the particular type of signal.

The foregoing examples have implicitly assumed that the
yardstick coefficient is greater than zero and that all of the
other coefficients are greater than or equal to zero. Although
this can happen, many situations will arise where either or
both of these assumptions will not hold. In order to specify the
sign of the non-yardstick coefficients, several methods are
possible. The most basic is to expand the amplitude range of
the band to a range having a magnitude of twice the
magnitude of the yardstick coefficient, and to assign at 638
reconstruction levels, as shown in FIG. 10a. For instance,
any coefficient falling in the zone lying between amplitude
values of 2.5 and 5.0 will be quantized at 640 as 3.75 and
will be assigned at 642 the three bit codeword "101". As
will be understood, the precision of such an arrangement is
only one half as fine as that which would be possible if it
were only necessary to quantize positive coefficients.
Negative values, such as those lying between -5.0 and -7.5, will
be quantized as -6.25 and will be assigned the
codeword "001".
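
The signed example can be reproduced with the following sketch. The yardstick magnitude of ten is an assumption inferred from the stated zone boundaries and reconstruction values; the function name and the clamping of the top edge of the range are likewise illustrative choices.

```python
def quantize_signed(x, yardstick, bits):
    """Quantize x over [-yardstick, +yardstick] with 2**bits equal zones.

    Returns (reconstruction value, codeword), the codeword being the
    binary position of the zone counted from the most negative one.
    """
    count = 2 ** bits
    step = 2.0 * yardstick / count
    index = int((x + yardstick) // step)
    # A value exactly at +yardstick belongs to the topmost zone.
    index = min(max(index, 0), count - 1)
    value = -yardstick + (index + 0.5) * step
    return value, format(index, "0{}b".format(bits))


positive_example = quantize_signed(3.0, 10.0, 3)
negative_example = quantize_signed(-6.0, 10.0, 3)
```

With three bits the zones are 2.5 wide, twice the width that the same three bits would give over a positive-only range of ten, which is the halving of precision noted above.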

Rather than an equal apportionment to positive and negative
values, it is possible to assign either the positive or
negative reconstruction levels more finely, as shown in FIG.
10b. In such a case, it will be necessary to give more
reconstruction levels to either the positive or the negative
portion of the range. In FIG. 10b, the positive portion has
four full reconstruction levels and part of the reconstruction
level centered around zero, while the negative portion has
three full reconstruction levels and part of the zero-centered
reconstruction level.

The foregoing examples demonstrate that with very accurate
quantization of the yardsticks, very accurate range
information for a particular band can be established.
Consequently, the reconstruction levels can be assigned to a
particular band more appropriately, so that the reconstructed
values are closer to the original values. The method of the
prior art results in relatively larger ranges for any given
band, and thus less appropriate assignment of reconstruction
levels.

The estimation of the masking level is also improved over
the prior art with application of the method of the invention.
Estimation of the masking level is based upon an estimation
of the magnitude of the coefficients |X(k)|. As has been
mentioned, in general, for each coefficient, the masking level
is a measure of how much noise, such as quantization noise,
is tolerable in the signal without it being noticeable by a
human observer. In most applications, signals of larger
amplitude can withstand more noise without the noise being
noticed. Factors in addition to amplitude also figure into the
masking level determination, such as frequency and the
amplitudes of surrounding coefficients. Thus, a better
estimation of |X(k)| for any given coefficient results naturally in
a better estimation of an appropriate masking level. The
masking level is used to fine-tune the allocation of bits to a
coefficient. If the coefficient is situated such that it can
tolerate a relatively high amount of quantization noise, then
the bit allocation takes this into account, and may reduce the
number of bits that would be allocated to a specific coefficient
(or band) as compared to the number that would have
been applied if the masking level were not taken into
account.

After the coefficients are encoded according to the method
of the invention, the stream of codewords is transmitted at
644 to the communication channel, or storage device, as in
the prior art shown in FIG. 3 at 112. After transmission, the
coded words are transformed back into an audio signal. As
shown in FIG. 12c, at 660 the coded yardstick coefficients
are quantized based on the assignment of reconstruction
levels to the codewords. The yardstick coefficients have
been quantized very accurately. Thus, upon translation of the
codewords into reconstructed levels, the reconstructed
yardstick coefficients will very accurately reflect the original
yardstick coefficients.

At 662, a decision is made whether or not to perform a
reverse DCT transform (or other appropriate transform) to
counteract any DCT type transform (discussed below) that
may have been applied at steps 616, 618 or 620 in the
encoder. If so, the reverse transform is applied at 664. If not,
the method of the invention proceeds to 666, where the
codewords for the non-yardstick coefficients of a single
frame are translated into quantization levels. Many different
schemes are possible and are discussed below.

The decoder translates the codewords into quantization
levels by applying an inverse of the steps conducted at the
encoder. From the yardstick coefficients, the decoder has
available the number of bands and the magnitudes of the
yardsticks. Either from side information or from preset
information, the number of non-yardstick coefficients in
each band is also known. From the foregoing, the
reconstruction levels (number and locations) can be established by
the decoder by applying the same rule as was applied by the
encoder to establish the bit allocations and reconstruction
levels. If there is only one such rule, the decoder simply
applies it. If there is more than one, the decoder chooses the
appropriate one, either based on side information or on
intrinsic characteristics of the yardstick coefficients. If the
codewords have been applied to the reconstruction levels
according to a simple ordered scheme, such as the binary
representation of the position of the reconstruction level
from lowest arithmetic value to highest, then that scheme is
simply reversed to produce the reconstruction level. If a
more complicated scheme is applied, such as application of
a codebook, then that scheme or codebook must be accessible
to the decoder.

The end result is a set of quantized coefficients for each
of the frequencies that were present in the spectrum X(k).
These coefficients will not be exactly the same as the
original, because some information has been lost by the
quantization. However, due to the more efficient allocation
of bits, better range division, and enhanced masking
estimation, the quantized coefficients are closer to the original
than would be requantized coefficients of the prior art.
(However, reconstituted non-yardstick coefficients typically
do not compare to the original non-yardstick coefficients as
accurately as the reconstituted yardstick coefficients compare
to the original yardstick coefficients.) After
requantization, the effect of the operation of raising the
frame to the fractional power α, such as 1/2, is undone at 668
by raising the values to the reciprocal power 1/α, in this
case, two. Next, at 670 the inverse transform of the TDAC
type transform applied at step 106 is applied to transform the
frequency information back to the time domain. The result
is a segment of data, specified at the sampling rate of, for
instance, 48 kHz. Sequential (typically overlapped) windows
are combined at 672 and audio is synthesized at 674.
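
A simplified sketch of the decoding of a single non-yardstick magnitude is given below. It assumes a simple ordered binary codeword, a positive-only range with reconstruction values at the center of each zone, and a fractional power of one-half; each of these is one of several options the description allows, and the function name is hypothetical.

```python
def decode_magnitude(codeword, yardstick, alpha=0.5):
    """Translate a binary codeword back into a coefficient magnitude.

    yardstick is the band's reconstructed yardstick magnitude in the
    compressed domain; the number of bits is the codeword length.
    """
    count = 2 ** len(codeword)
    step = yardstick / count
    index = int(codeword, 2)           # position, lowest level first
    compressed = (index + 0.5) * step  # center of the selected zone
    # Undo the fractional power by raising to the reciprocal power.
    return compressed ** (1.0 / alpha)


# Yardstick of 4.0 in the compressed domain, two bits per coefficient.
example_magnitude = decode_magnitude("11", 4.0)
```

In this sketch the codeword "11" selects the top zone, whose center of 3.5 in the compressed domain corresponds to 12.25 once squared.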

The foregoing discussion has assumed that only the
magnitude of the yardstick coefficients was encoded accurately
at 614, and that neither the location of the yardstick
coefficient within the band (i.e., second coefficient from the
low frequency end of the band, fourth coefficient from the
low frequency end of the band, etc.) nor the sign (or phase)
was encoded. By encoding either the location, or both of
these additional facts, additional improvement in coding can
be achieved. In fact, encoding of the location provides
significant savings, since if not it would be necessary to
encode the yardstick coefficient twice: once to establish the
estimation of |X(k)|^α and a second time for its contribution
to the signal as a coefficient.

If the location of the yardstick coefficient had not been
encoded, it would be necessary to encode its magnitude in
the stream of all coefficients, for instance at step 624 shown
in FIG. 12a. However, if the yardstick coefficients are fully
encoded with magnitude and location and sign, then their
coded values can simply be transmitted. If the location is not
coded, then the apparatus must first transmit the magnitudes
of each yardstick, e.g., at step 628 in FIG. 12b. Subsequently,
bits are allocated to each band, and to each coefficient within
the band, including the yardstick coefficient, at step 636. If
yardstick location information has not been stored, the
system is insensitive to the special identity of the yardstick
and allocates bits to it at 636, quantizes it into a reconstruction
level at 640, encodes it at 642 and transmits its
amplitude at 644. Thus, its amplitude is transmitted twice:
first at 628 and second at 644.

If, however, the location is coded originally at 626, when
the system prepares to allocate bits to the yardstick at 636,
the yardstick coefficient will be identified as such, due to its
location, and will be skipped, thus saving the bits necessary
for coding its amplitude. Specifying the location of a
yardstick typically only improves efficiency if fewer bits
are required to specify its location than to specify its
amplitude. In some cases it may be beneficial to code the
locations of certain yardstick signal elements, but not all.
For instance, if a band includes a great number of
coefficients, it may not be advantageous to encode the
location of the yardstick in that band; however, it may still be
beneficial to encode the location of a yardstick coefficient in
a band having fewer coefficients. Further, in assessing the
advantage from specifying the location of the yardstick
coefficients, the probable additional computation and perhaps
memory burdens required at both the coding and
decoding apparatus must be considered, in light of the
available data channel bandwidth. Typically, it is more cost
effective to accept higher computational or memory burdens
than bandwidth burdens.

If at 614 (FIG. 12a) it is decided to quantize the location
of the coefficient in the band accurately, a few additional bits
will be necessary to specify and encode each yardstick
coefficient. Typically, the number of coefficients that will be
in each band is decided before the coefficients are coded.
This information is typically known to the decoder, although
it is also possible to vary this information and to include it
in the side information transmitted by the encoder. Thus, for
each band, the location of the yardstick coefficient can be
exactly specified, and it is only necessary to reserve as many
bits for the location information as are required by the
number of coefficients in the band in question. For this
reason, it is beneficial to assign to each band a number of
coefficients that is a power of two, so that no bits are wasted
in the specification of the location of the yardstick
coefficient.
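The relationship between band size and location bits can be
illustrated with a minimal sketch. The function names below are
hypothetical and do not appear in the specification; the sketch
assumes the location is coded as a plain fixed-length binary
index within the band.

```python
def location_bits(num_coefficients: int) -> int:
    """Bits needed for a fixed-length index into a band (assumed coding)."""
    if num_coefficients < 1:
        raise ValueError("a band must hold at least one coefficient")
    # Smallest b such that 2**b >= num_coefficients.
    return (num_coefficients - 1).bit_length()


def unused_location_codes(num_coefficients: int) -> int:
    """Index values that can be expressed but never point at a coefficient."""
    return (1 << location_bits(num_coefficients)) - num_coefficients


# A 16-coefficient band uses every 4-bit index; a 12-coefficient band
# also needs 4 bits but leaves 4 index values unused.
print(location_bits(16), unused_location_codes(16))
print(location_bits(12), unused_location_codes(12))
```

Under this assumption, a band whose size is a power of two leaves
no index value unused, which is the sense in which no location
bits are wasted.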

As has been mentioned above, a basic method to allocate
bits within the band is to allocate an equal number of bits to
each non-yardstick coefficient. However, in some cases this
cannot be done, for instance when the number of bits
available is not an integer multiple of the number of
non-yardstick coefficients. In this case, it is frequently
beneficial to give more bits to the coefficients that are
closest (in location within the band) to the yardstick
coefficient, because experience has shown that for audio-type
signals, adjacent coefficients are often closer to each other
in magnitude than are distant coefficients.
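One possible reading of this rule is sketched below. The function
name and the tie-breaking choice (the lower-frequency neighbour
wins when two coefficients are equally close to the yardstick) are
assumptions made for illustration, not details taken from the
specification.

```python
def allocate_band_bits(num_coefficients, yardstick_index, band_bits):
    """Split band_bits among the non-yardstick coefficients of one band.

    Every non-yardstick coefficient receives the same base number of
    bits; leftover bits go, one each, to the coefficients nearest the
    yardstick. The yardstick itself receives no bits here.
    """
    others = [i for i in range(num_coefficients) if i != yardstick_index]
    bits = [0] * num_coefficients
    if not others:
        return bits
    base, extra = divmod(band_bits, len(others))
    for i in others:
        bits[i] = base
    # Nearest first; on a tie the lower index (lower frequency) is
    # preferred. The tie-break is an assumption of this sketch.
    nearest_first = sorted(others, key=lambda i: (abs(i - yardstick_index), i))
    for i in nearest_first[:extra]:
        bits[i] += 1
    return bits


# 16 bits over 7 non-yardstick coefficients: 2 bits each, 2 left over.
print(allocate_band_bits(8, 3, 16))  # -> [2, 2, 3, 0, 3, 2, 2, 2]
```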

There are various other uses to which extra bits can be put.
For instance, more preference can be given to coefficients
lying to the left of the yardstick coefficient, i.e. of a lower
frequency than the yardstick coefficient. This is in
consideration of the masking result. Typically, the impact of a
specific frequency component on the masking function
occurs with respect to a higher frequency region than the
frequency in question. Therefore, giving preference to
coefficients of lower frequency than the yardstick (thus lying
to the left of the yardstick on a conventional scale such as
shown in FIG. 11) will more accurately encode the coefficients
that have impact on the higher frequency components.
In some circumstances, it may even be beneficial to favor
those lower frequency coefficients more heavily than with
just the single extra bit available from an odd number of
extra bits. For instance, additional bits could be given to five
coefficients on the lower side of the yardstick, but only to
two on the higher side.
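The five-below, two-above example can be sketched as a selection
of which coefficients receive an additional bit. The function name
and its default counts are hypothetical and simply mirror the
example in the text; the sketch assumes the coefficients nearest
the yardstick on each side are the ones chosen.

```python
def extra_bit_targets(num_coefficients, yardstick_index,
                      low_count=5, high_count=2):
    """Pick the coefficients that get one additional bit each.

    Returns (low, high): indices below and above the yardstick,
    nearest first, truncated at the band edges.
    """
    low = list(range(yardstick_index - 1, -1, -1))[:low_count]
    high = list(range(yardstick_index + 1, num_coefficients))[:high_count]
    return low, high


# Yardstick at index 8 of a 16-coefficient band.
print(extra_bit_targets(16, 8))  # -> ([7, 6, 5, 4, 3], [9, 10])
```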

Thus, accurately specifying the location of the yardstick
coefficient within the band allows a more appropriate
allocation of the bits among the various non-yardstick
coefficients. With more appropriate allocation of bits per
non-yardstick coefficient, the division of the bits into
appropriate reconstruction levels, as discussed above, is
further enhanced.

Knowing the location of the yardstick coefficients also
permits a better rough estimation of |X(k)|^α, which in turn
allows a better estimation of the masking function. If the
locations of the yardstick coefficients are known, then the
estimation of |X(k)|^α can be as shown in FIG. 11, rather than
as shown in FIG. 7a. Without the location information, all
that can be estimated is that the coefficients in the band are
on average each less than some fraction of the magnitude of
the yardstick coefficient. However, knowing the locations
enables the typically more accurate estimation shown in
FIG. 11, where each non-yardstick coefficient is assigned an
estimated value based on the relationship between adjacent
yardsticks. The assumption underlying such an estimation is
that the magnitudes of coefficients do not change very
much from one coefficient to the next, and thus the
non-yardstick coefficients will generally lie along the lines
connecting the adjacent yardsticks. Thus, once the more
refined estimate of |X(k)|^α is acquired, the estimates for
the individual coefficients can be used to implement either of
the two modes of allocating bits: the bit allocation for the
bands followed by the bit allocation for the coefficients, or
the direct bit allocation for the coefficients. Further, this
refined estimate can also be used to establish the masking
level more appropriately. Thus, the bit allocation, and
consequently also the range allocation, is enhanced by encoding
the location of the yardsticks.
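The estimate along the lines connecting adjacent yardsticks can be
sketched as a piecewise linear interpolation. The function name is
hypothetical, and holding the estimate flat outside the first and
last yardstick is an assumption of this sketch rather than
something the specification states.

```python
import bisect


def estimate_spectrum(num_coefficients, yardsticks):
    """Rough estimate of the scaled spectrum from located yardsticks.

    yardsticks is a list of (index, magnitude) pairs with strictly
    increasing indices. Coefficients between two yardsticks are placed
    on the straight line joining them.
    """
    positions = [p for p, _ in yardsticks]
    values = [v for _, v in yardsticks]
    estimate = []
    for k in range(num_coefficients):
        if k <= positions[0]:
            estimate.append(values[0])      # flat before first yardstick
        elif k >= positions[-1]:
            estimate.append(values[-1])     # flat after last yardstick
        else:
            j = bisect.bisect_right(positions, k) - 1
            x0, x1 = positions[j], positions[j + 1]
            t = (k - x0) / (x1 - x0)
            estimate.append(values[j] + t * (values[j + 1] - values[j]))
    return estimate


# Yardsticks of magnitude 8 at index 2 and magnitude 4 at index 6.
print(estimate_spectrum(9, [(2, 8.0), (6, 4.0)]))
# -> [8.0, 8.0, 8.0, 7.0, 6.0, 5.0, 4.0, 4.0, 4.0]
```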

If the location of each yardstick coefficient has been
specified, then it is possible without redundancy to go back
to any yardsticks that have been encoded and enhance the
accuracy of their coding if more bits are available than was
assumed at the time of yardstick encoding. For instance, a
particular band may have received a very large number of
bits due to a very large yardstick, but may not require such
a large number of bits to encode the other signal elements,
due to a very small number of signal elements being in the
band. If the locations are known, more bits can be allocated
to specifying the amplitude of the yardstick coefficient after
the first pass of allocation of bits to yardsticks. If the
locations are not known, this cannot be done efficiently
without redundancy. One way to further specify the magnitude
of the yardstick would be to use the extra bits to encode
the difference between the magnitude of the yardstick as first
encoded and the original yardstick amplitude. Because the
decoding apparatus will be employing the same routines to
determine how bits have been allocated as were used by the
encoder, the decoder will automatically recognize the
enhanced yardstick amplitude information properly.
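The difference-coding idea can be illustrated with a two-pass
sketch. Everything here is an assumption made for illustration: the
uniform quantizer, the mid-cell reconstruction points, and the
function names are not taken from the specification, which does not
fix a particular quantizer design at this point.

```python
def quantize_uniform(value, lo, hi, bits):
    """Uniform quantizer on [lo, hi] with 2**bits cells (assumed design).

    Returns (cell index, mid-cell reconstruction value).
    """
    levels = 1 << bits
    step = (hi - lo) / levels
    index = min(int((value - lo) / step), levels - 1)
    return index, lo + (index + 0.5) * step


def encode_yardstick_two_pass(magnitude, max_magnitude, first_bits, extra_bits):
    """Coarse code first, then spend extra bits on the remaining difference."""
    idx1, coarse = quantize_uniform(magnitude, 0.0, max_magnitude, first_bits)
    # The first-pass error lies within half a first-pass cell.
    half_cell = max_magnitude / (1 << first_bits) / 2
    idx2, residual = quantize_uniform(magnitude - coarse,
                                      -half_cell, half_cell, extra_bits)
    return idx1, idx2, coarse, coarse + residual


idx1, idx2, coarse, refined = encode_yardstick_two_pass(5.3, 8.0, 3, 2)
print(idx1, idx2, coarse, refined)
```

A decoder that derives the same bit allocation would know how many
refinement bits follow and could add the decoded difference to the
first-pass value in the same way.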

Additional coding efficiency and accuracy can be
achieved by accurately specifying and encoding the sign of
the yardstick coefficient (which corresponds to the phase of
the signal components at that frequency). Only one
additional bit per yardstick coefficient is necessary to
encode its sign if X(k) is real-valued.

Knowing the sign of the yardstick coefficient enhances the
ability of the method to efficiently determine reconstruction
levels within a given band. For instance, experience
indicates that a band may often include more non-yardstick
coefficients having the same sign as the yardstick coefficient.
Therefore, it may be beneficial to provide one or two more
reconstruction levels having that sign.

Knowing the sign of the yardstick does not generally
enhance estimation of the masking effect. The usefulness of
the sign information varies depending upon which transform
has been used.

Another preferred embodiment of the method of the
invention is particularly useful if the number of bands is
relatively small. This embodiment entails a further division
of each band in the spectrum X(k) into two split-bands at
step 612 of FIG. 12a. One split-band includes the yardstick
coefficient and the other does not. The split-bands should,
preferably, divide the band roughly in half. The coefficient of
greatest magnitude in the split-band that does not contain the
yardstick coefficient is also selected at 650 and quantized at
624. The division of two of the bands, bands b2 and b4, into
split-bands is shown schematically in FIG. 7, by a dashed
vertical line through the centers of these two bands. If this
embodiment is implemented, the yardstick and the additional
coded coefficient are referred to herein as the major and
minor yardstick coefficients, respectively. Step 650 takes
place between the selection of the major yardstick
coefficients at 608 and the encoding of the magnitude of any
yardstick coefficients at 626.

The magnitudes of the minor yardstick coefficients are
also quantized accurately at 624. Because they are minor 
yardsticks, it is known that they are of no greater magnitude 
than the major yardstick coefficients. This fact can be used 
to save bits in their encoding. 

There are various ways to divide the entire frame into, for
instance, sixteen bands. One is to divide the segment from
the beginning into sixteen bands. The other is to divide the
entire segment into two, and then divide each part into two,
and so on, with information derived from the first division
being more important than information derived from the
second division. Using split bands thus provides a hierarchy
of important information. The first division is more
important than the second division, which is more important
than the next division, etc. Thus, it may be beneficial to
preserve bits for the more important divisions.

As has been mentioned above, it may be beneficial to
apply a second transformation to the yardsticks before
quantizing, coding and transmitting at steps 624, 626 and 628,
respectively. This second transformation could be applied to
both major and minor yardsticks, or to either major or minor
yardsticks alone. This is because, depending on the nature of
the signal, there may be some pattern or organization among
the yardstick coefficients. As is well known, transformations
take advantage of a pattern in data to reduce the amount of
information that is necessary to accurately define the
data. For instance, if each yardstick coefficient were simply
twice the magnitude of the preceding coefficient, it would
not be necessary to quantize, code and transmit the
magnitudes of all of the coefficients. It would only be
necessary to code the magnitude of the first, and to apply a
doubling function to the received coefficient for the required
number of steps.
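The doubling example can be made concrete with a short sketch. The
function names are hypothetical; the sketch only illustrates how a
known pattern lets a single value stand in for a whole sequence,
and is not a substitute for a general transform such as the DCT.

```python
def is_doubling(magnitudes):
    """True if every magnitude is exactly twice the one before it."""
    return all(b == 2 * a for a, b in zip(magnitudes, magnitudes[1:]))


def decode_doubling(first_magnitude, count):
    """Rebuild the sequence from its first value by repeated doubling."""
    return [first_magnitude * 2 ** i for i in range(count)]


yardsticks = [3, 6, 12, 24]
if is_doubling(yardsticks):
    # Only the first magnitude (and the agreed count) must be sent.
    print(decode_doubling(yardsticks[0], len(yardsticks)))  # -> [3, 6, 12, 24]
```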




Thus, at step 622, 652 or 654 (depending on which of
magnitude, location and sign are being quantized
accurately), it is decided whether or not to apply a second
transformation to the yardstick coefficients according to a
known method, such as the DCT. If the nature of the data is
such that the transformation is likely to provide a more
compact mode of coding, then at steps 618, 616 or 620, another
transformation is applied. FIG. 12a indicates that the
transformation is a DCT transformation; however, any
transformation that achieves the goal of reducing the amount
of data that must be transmitted can be used. Other appropriate
types of transformations include the Discrete Fourier Transform.

It is because of this potential yardstick-only
transformation that it is not appropriate in all cases to
conclude that, according to the method of the invention, the
higher accuracy to which the yardstick coefficients are encoded
is the result of devoting more bits to each yardstick
coefficient (on average) than to each non-yardstick coefficient
(on average). This is because the application of the
yardstick-only transformation may result in a significant
reduction in the number of bits necessary to encode all of the
yardstick coefficients, and thus of any single yardstick
coefficient (on average). Of course, this savings in bits is
achieved at the cost of an increase in computational
requirements, both in encoding and decoding. In some
applications, the bit savings will justify the computational
burden. In others, it may not. Both will be apparent to those
of ordinary skill in the art.

If the yardsticks are twice transformed, they must be
inverse transformed back into the frequency domain of X(k)
at 632 in order to simplify the calculations required for bit
allocation at 634, 636 and design of reconstruction levels at
638, as discussed above. Alternatively, rather than inverse
transformation, the yardsticks can be stored in a memory in
the encoder, and retrieved prior to step 634.

During the decoding steps of the method of the invention, 
the exact manner of translation at step 666 from transmitted 
non-yardstick codewords to quantization levels will depend
on whether split bands have been used, whether location or 
location and sign of the yardstick coefficients have also been 
encoded accurately, and how that information was packaged. 
If side information is used to transmit control data, then that 
side information must be decoded and applied. If all of the 
information necessary is contained in memory accessible by 
the decoder, then the codewords need only be translated 
according to established algorithms. 

For instance, an established algorithm may set the number
of coefficients per band in the first half of the frame at
sixteen and the number of coefficients per band in the second
half at thirty-two. Further, a rule might be established to
allocate bits within a band evenly among coefficients, with
any extra bits being given, one each, to the first coefficients
in the band. If the sign of the yardstick coefficient is
quantized, then each coefficient may be divided into
reconstruction levels with one additional reconstruction level
having a sign that is the same as the yardstick coefficient.
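A fixed rule of this kind, shared by encoder and decoder, can be
sketched as follows. The function names are hypothetical, and the
sketch assumes the frame length divides evenly into the stated band
sizes and that the bits are shared among all coefficients passed
in; whether the yardstick is among them is left to the caller.

```python
def band_sizes(frame_length, first_half_size=16, second_half_size=32):
    """Band sizes for a frame: small bands first, larger bands after."""
    half = frame_length // 2
    first = [first_half_size] * (half // first_half_size)
    second = [second_half_size] * ((frame_length - half) // second_half_size)
    return first + second


def even_allocation(band_bits, num_coefficients):
    """Equal share per coefficient; leftovers go to the first coefficients."""
    base, extra = divmod(band_bits, num_coefficients)
    return [base + 1 if i < extra else base for i in range(num_coefficients)]


print(band_sizes(128))          # -> [16, 16, 16, 16, 32, 32]
print(even_allocation(37, 16))  # five coefficients get 3 bits, eleven get 2
```

Because both sides run the same rule on the same inputs, no side
information is needed to tell the decoder how many bits each
codeword occupies.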

In light of the foregoing detailed discussion of the method
of the invention, the apparatus of the invention will be
understood from FIG. 13a, showing the transmitter portion
of the apparatus, and FIG. 13b, showing the receiver portion.
The apparatus of the invention can be implemented in
dedicated processors or a properly programmed general
purpose digital computer.

TDAC type transformer 802 transforms an audio-type
signal, such as x(t), into a spectrum, such as X(k). (A DCT
transformer is also appropriate and within the contemplation
of the invention.) The |·|^α operator scales the spectrum to a
domain more pertinent to human perception, or for use when
non-uniform quantization is desired. Spectral band divider 806
divides the scaled spectrum up into separate bands. Yardstick
coefficient identifier 808 identifies the coefficient in
each band having the largest magnitude. Quantizers 810 and
812 quantize the magnitude of the yardstick coefficients (and
perhaps the sign) and, if desired, the location within the band,
respectively. DCT transformer 816 applies a DCT or similar
transform to the quantized yardstick information, if it is
determined that enough structure exists among the yardstick
coefficients to justify the additional computation. Coder 818
encodes the quantized yardstick information, whether or not
the DCT transformer operates upon the information,
producing a series of codewords, which are transmitted by
transmitter 820 onto a data channel.

In a preferred embodiment, band-wise bit allocator 822
takes the information from the yardstick magnitude
quantizers 810 and uses that information to establish a rough
estimate of |X(k)|^α as shown in FIG. 7a, and uses this
estimate to allocate the limited number of available bits
among the bands in the spectrum established by spectral
band divider 806. Coefficient-wise bit allocator 824 uses the
information from the yardstick position and sign quantizers
812 and 814, along with the allocation of bits within the band,
to allocate the band's bits among the coefficients in that
band. Non-yardstick quantizer 826 uses the same
information to establish appropriate reconstruction levels for
each coefficient in the band and to quantize each coefficient.
The quantized coefficients are passed to coder 818, which
assigns a codeword to each non-yardstick coefficient and
passes the codewords on to transmitter 820 for transmission.

In another preferred embodiment of the apparatus, the
band-wise bit allocator can also take information from the
yardstick position quantizer 812 in establishing the rough
estimate of |X(k)|^α. The band-wise bit allocator would
establish a rough estimate as shown in FIG. 11 if the location
information is used, and from this estimate, would allocate
bits to the bands.

In another embodiment of the apparatus of the invention, 
the band-wise bit allocator 822 also takes sign information
from magnitude quantizer 810 and location information 
from location quantizer 812 to allocate bits to the band, as 
discussed above with respect to the method of the invention. 

The receiver or decoder portion of the invention is shown
schematically in FIG. 13b. Receiver 920 receives the
codewords from the communication channel. Yardstick decoder
918 decodes the yardstick data, resulting in quantized data
that represents the yardsticks. Reverse DCT transformer 916
undoes the effect of any DCT type transformation that was
applied at 816, resulting in a set of scaled yardstick
coefficients that are very close in magnitude to the original
scaled yardstick coefficients before quantization in magnitude
quantizer 810. Non-yardstick decoder 926 receives the
codewords representing the non-yardstick coefficients and
translates those codewords into reconstructed non-yardstick
coefficients. As has been mentioned above in connection
with the method, the operation of decoder 926 will depend
on the means by which the non-yardstick information was
coded. Operator 904 raises the quantized coefficients in the
reconstructed spectrum to the power of 1/α to undo the
effect of operator 804. Reverse transformer 902 applies an
inverse transform to the spectrum to undo the effect of the
TDAC transformer 802, and to transform the signal from the
frequency domain back to a time domain, resulting in a
windowed time domain segment. Combiner 928 combines
the separate sampled windows, and synthesizer 930
synthesizes an audio-type signal.

Another preferred embodiment of the encoder omits the
band-wise bit allocator and includes only a coefficient-wise
bit allocator, which takes the estimate of |X(k)|^α and uses
that to directly allocate bits to the coefficients, as described
above with respect to the method of the invention.

The foregoing discussion of method and apparatus has
assumed that the yardstick coefficients are the coefficients
having the maximum absolute value of amplitude in the
band. It is also beneficial to use a coefficient other than the
maximum magnitude as the reference yardstick against
which the others are measured. For instance, although it is
believed that optimal results will be achieved using the
maximum amplitude coefficient, beneficial results could be
obtained by using a coefficient having an amplitude near to
the greatest, such as the second or third greatest. Such a
method is also within the contemplation of the invention and
is intended to be covered by the attached claims.

The reference yardstick may also be the coefficient having
a magnitude that is closest, among all of the magnitudes of
other coefficients in the band, to the middle or median
coefficient in the band. A middle value yardstick is beneficial
in cases where the statistical characteristics of the signal are
such that the middle, or median, value contains more
information about the total energy in the signal than does the
maximum value in a band. This would be the case if the
typical signal is characterized by excursions within a steady
range above and below a middle value. It would also be
necessary to characterize or estimate a range for the
magnitude of the excursions. For example, if the middle value of
a band had a value of positive five, and it were known from
the statistics of the type of signal that such signal values
typically diverge from the median by only ± four units, the
range would be set from positive one to positive nine, and
reconstruction levels would be established within the range.
As before, the reconstruction levels can be evenly divided,
or can be concentrated more around the middle value, or
skewed toward either end of the range, depending upon
statistical information about the particular class of signal.
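The numerical example can be worked through in a short sketch.
The function names are hypothetical, and only the evenly divided
case is shown; placing each level at the centre of its cell is an
assumption of this sketch.

```python
def excursion_range(middle_value, spread):
    """Range covered by excursions of +/- spread around the middle value."""
    return middle_value - spread, middle_value + spread


def even_levels(lo, hi, bits):
    """Evenly spaced reconstruction levels: 2**bits cells, mid-cell values."""
    count = 1 << bits
    step = (hi - lo) / count
    return [lo + (i + 0.5) * step for i in range(count)]


lo, hi = excursion_range(5, 4)
print(lo, hi)                  # -> 1 9
print(even_levels(lo, hi, 2))  # -> [2.0, 4.0, 6.0, 8.0]
```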

Similarly, the yardstick coefficient may be the coefficient
having a magnitude that is closest to the average of all of the
magnitudes of the other coefficients in the band. Such an
average value is useful if the average value represents a
better estimate of the energy in the band than any other
value, for instance the maximum or the median values.

The invention has been discussed above with respect to a
signal that has been divided into a plurality of bands, and this
is expected to be the application for which the invention
provides the greatest benefits. However, the invention is also
useful in connection with coding the amplitudes of a
plurality of coefficients in only a single band. Application of the
invention to a signal or signal component on only a single
band follows the same principles as the application to
multi-band signals discussed above. The yardstick is
selected and quantized accurately, preferably although not
necessarily encoding the location and the sign of the
yardstick. The accurate quantization of the yardstick is used in
conjunction with the number of available bits to establish
reconstruction levels and to allocate bits among the
non-yardstick coefficients. All of the considerations discussed
above apply to the single band embodiment, except that the
number of bits available for the band will be determined, and
will not depend on the specifics of other bands, if any.

The present invention has many benefits. The bits related
to bit allocation, such as the magnitudes of the yardstick
coefficients as well as their locations and signs, will be well
protected. Thus, any error that occurs will be localized to



